Abstract

For discrete-time stochastic processes, there is a close connection between
return/waiting times and entropy. Such a connection cannot be straightforwardly
extended to the continuous-time setting. Contrary to the discrete-time
case, one does need a reference measure, and so the natural object is relative
entropy rather than entropy. In this paper we elaborate on this in the case of
continuous-time Markov processes with finite state space. A reference measure
of special interest is the one associated to the time-reversed process. In that
case relative entropy is interpreted as the entropy production rate. The main
results of this paper are: almost-sure convergence to relative entropy of suitable
waiting times, and their fluctuation properties (central limit theorem and large
deviation principle).


1 Introduction 

Many limit theorems in the theory of stochastic processes have a version for
discrete-time as well as for continuous-time processes. The ergodic theory of Markov chains,
for example, is more or less identical in discrete and in continuous time. The same holds for the
Ergodic Theorem, martingale convergence theorems, central limit theorems and large
deviations for additive functionals, etc. Usually, one obtains the same results with
some additional effort in the continuous-time setting, where, e.g., extra measurability
issues can show up.

For discrete-time ergodic processes, there is a remarkable theorem connecting
recurrence times and entropy [§]. In words, it states that the logarithm of the first time
the process repeats its first n symbols typically behaves like n times the entropy of the
process. This provides a way to sample entropy by observing a single, typical trajectory
of the process. This result seems a natural candidate to transport to a
continuous-time setting. The relation between entropy and return times is sufficiently intuitive
that one would not expect major obstacles on the road toward such a result for
continuous-time ergodic processes. There is, however, one serious problem. On the
path space of continuous-time processes (on a finite state space, say), there is no
natural flat measure. In the discrete-time setting one cannot distinguish between the entropy
of a process and the relative entropy between the process and the uniform measure on
trajectories. These only differ by a constant and a minus sign. As we shall see below,
this difference between relative entropy and entropy does play an important role in
turning to continuous-time processes. Therefore, there will be no continuous-time
analogue of the relation between return times and “entropy”. In fact, the logarithm
of return times turns out to have no suitable way of being normalized, even for very
simple processes in continuous time such as Markov chains. To circumvent this
drawback, we propose here to consider differences of the logarithm of suitable waiting
times and to relate them to relative entropy.

Another aspect of our approach is to first discretize time, and show that the 
relation between waiting times and relative entropy persists in the limit of vanishing 
discrete time-step. From the physical point of view, the time-step of the discretization
is the inverse of the acquisition frequency of the device one uses to sample the process. We can also
think of numerical simulations for which the discretization of time is unavoidable. 
Of course, the natural issue is to verify if the results obtained with the discretized 
process give the correct ones for the original process, after a suitable rescaling and by 
letting the time-step go to zero. This will be done in the present context. 

In this paper, we will restrict ourselves to continuous-time Markov chains with
finite state space for the sake of simplicity, and also because the aforementioned
problem already appears in this special, yet fundamental, setting. The main results of this
paper are: a law of large numbers for the difference of the logarithm of certain waiting
times giving a suitable relative entropy, a large deviation result and a central limit
theorem. One possible application is the estimation of the relative entropy density
between the forward and the backward process, which is physically interpreted as the
mean entropy production, and which is strictly positive if and only if the process is
not reversible (i.e., not in “detailed balance”, or “equilibrium”).

Our paper is organized as follows. In Section 2 we show why the naive
generalization of the Ornstein-Weiss theorem fails. Section 3 contains the main results: a
law of large numbers, large deviations and a central limit theorem for the logarithm of
ratios of waiting times. In the final section we consider the problem of “shadowing” a
given continuous-time trajectory drawn from an ergodic distribution on path space.

2 Naive approach


In this section we start with an informal discussion motivating the quantities which
we will consider in what follows. Let $\{X_t, t \geq 0\}$ be a continuous-time Markov chain
with state space $A$, with stationary measure $\mu$, and with generator

\[ Lf(x) = \sum_{y \in A} c(x)\, p(x,y)\, (f(y) - f(x)) \]

where $p(x,y)$ is a transition probability of a discrete-time irreducible Markov chain
on $A$, with $p(x,x) = 0$, and where the escape rates $c(x)$ are strictly positive. Given
a time-step $\delta$, we can discretize the Markov chain to obtain its “$\delta$-discretization”
$\{X_{i\delta}, i = 0, 1, 2, \ldots\}$. Next we define the first time the $\delta$-discretized process repeats
its first $n$ symbols via the random variable

\[ R_n^\delta(X) := \inf\{k \geq 1 : (X_0, \ldots, X_{(n-1)\delta}) = (X_{k\delta}, \ldots, X_{(k+n-1)\delta})\}. \tag{2.1} \]

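The analogue of the Ornstein-Weiss theorem !|J| for this continuous-time process would
be a limit theorem for a suitably normalized version of $\log R_n^\delta$ for $n \to \infty$, $\delta \to 0$.
However, for $\delta > 0$ fixed, the ergodicity of the $\delta$-discretization
$\{X_0, X_\delta, X_{2\delta}, \ldots, X_{n\delta}, \ldots\}$, together with the
Ornstein-Weiss and Shannon-McMillan-Breiman theorems 0], yields

\[ \frac{1}{n} \log\left[ R_n^\delta(X)\, \mathbb{P}(X_1^n) \right] = o(1) \quad \text{eventually a.s. as } n \to \infty, \]

where $\mathbb{P}(X_1^n)$ is shorthand for the probability of the first $n$ symbols of the $\delta$-discretized path.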

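Using the fact that $X$ is an ergodic Markov chain we obtain

\[
\begin{aligned}
-\frac{1}{n} \log R_n^\delta(X)
&= \frac{1}{n} \log \mu(X_0) + \frac{1}{n} \sum_{i=1}^{n-1} \log p_\delta^X(X_i, X_{i+1}) + o(1) \\
&= \mathbb{E}\left[ \log p_\delta^X(X_0, X_1) \right] + o(1) \quad \text{eventually a.s. as } n \to \infty
\end{aligned}
\tag{2.2}
\]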


where $\mathbb{E}$ denotes expectation in the Markov chain started from its stationary
distribution and where $p_\delta^X$ denotes the transition probability of the $\delta$-discretized Markov
chain $\{X_{i\delta}, i = 0, 1, 2, \ldots\}$, i.e.,

\[ p_\delta^X(x,y) = (e^{\delta L})_{xy} = I_{xy}(1 - \delta c(x)) + \delta c(x) p(x,y) + O(\delta^2). \]


Therefore,

\[
-\lim_{n \to \infty} \frac{1}{n} \log R_n^\delta(X)
= \sum_{x \in A} \mu(x)(1 - \delta c(x)) \log(1 - \delta c(x))
+ \sum_{x,y \in A} \mu(x)\, \delta c(x) p(x,y) \log(\delta c(x) p(x,y)) + O(\delta^2).
\]


In this expression we see that the first term is of order $\delta$ whereas the second one
is of order $\delta \log \delta$. Therefore, this expression does not seem to have a natural way
to be normalized. This is a typical phenomenon for continuous-time processes: we
need a suitable reference process in order to define “entropy” as “relative entropy”
with respect to this reference process. Indeed, as we will see in the next sections, by
considering differences of waiting times one is able to cancel the $\delta \log \delta$ term in order
to obtain an expression that makes sense in the limit $\delta \downarrow 0$.
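
The normalization problem can be seen numerically. The following is a minimal sketch, assuming a hypothetical three-state chain (the rates `c` and the matrix `p` are illustrative choices, not taken from the paper). It evaluates the first-order expression above, with the $O(\delta^2)$ remainder dropped, and divides by $\delta$; the result keeps drifting like $\log \delta$ instead of converging.

```python
import numpy as np

def stationary_distribution(c, p):
    """Stationary law mu of the chain with generator L = diag(c) (p - I)."""
    n = len(c)
    L = np.diag(c) @ (p - np.eye(n))
    # Solve mu L = 0 together with sum(mu) = 1 in the least-squares sense.
    A = np.vstack([L.T, np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

def first_order_rate(delta, c, p, mu):
    """First-order expression for -lim (1/n) log R_n^delta (O(delta^2) dropped)."""
    stay = np.sum(mu * (1 - delta * c) * np.log(1 - delta * c))
    jump = 0.0
    for x in range(len(c)):
        for y in range(len(c)):
            if p[x, y] > 0:
                w = delta * c[x] * p[x, y]
                jump += mu[x] * w * np.log(w)
    return stay + jump

# Hypothetical example: escape rates c and jump matrix p with p(x, x) = 0.
c = np.array([1.0, 2.0, 3.0])
p = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
mu = stationary_distribution(c, p)
for delta in (1e-2, 1e-4, 1e-6):
    # Dividing by delta does not stabilize: a log(delta) * sum(mu * c) term remains.
    print(delta, first_order_rate(delta, c, p, mu) / delta)
```

In this sketch the gap between the normalized values at two time-steps is close to $(\log \delta_1 - \log \delta_2) \sum_x \mu(x) c(x)$, which is the $\delta \log \delta$ contribution that the differences of waiting times are designed to cancel.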

3 Main results: waiting times and relative entropy

We consider continuous-time Markov chains with a finite state-space $A$. We will
always work with irreducible Markov chains with a unique stationary distribution.
The process is denoted by $\{X_t : t \geq 0\}$. The associated measure on path space
starting from $X_0 = x$ is denoted by $\mathbb{P}_x$, and by $\mathbb{P}$ we denote the path space measure
of the process started from its unique stationary distribution. For $t \geq 0$, $\mathcal{F}_t$ denotes
the sigma-field generated by $X_s$, $s \leq t$, and $\mathbb{P}^{[0,t]}$ denotes the measure $\mathbb{P}$ restricted to
$\mathcal{F}_t$.


3.1 Relative entropy: comparing two Markov chains 

Consider two continuous-time Markov chains, one denoted by $\{X_t : t \geq 0\}$ with
generator

\[ Lf(x) = \sum_{y \in A} c(x)\, p(x,y)\, (f(y) - f(x)) \tag{3.1} \]

and the other denoted by $\{Y_t : t \geq 0\}$ with generator

\[ \tilde{L}f(x) = \sum_{y \in A} \tilde{c}(x)\, p(x,y)\, (f(y) - f(x)) \tag{3.2} \]

where $p(x,y)$ is the Markov transition function of an irreducible discrete-time Markov
chain. We further assume that $p(x,x) = 0$, and $c(x) > 0$ for all $x \in A$. We suppose
that $X_0$, resp. $Y_0$, is distributed according to the unique stationary measure $\mu$, resp.
$\tilde{\mu}$, so that both processes are stationary and ergodic.

Remark 1. The fact that the Markov transition function $p(x,y)$ is the same for both
processes is only for the sake of simplicity. All our results can be reformulated for the
case where the Markov transition functions are different.


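We recall Girsanov’s formula |S]:

\[
\frac{d\mathbb{P}^{[0,t]}}{d\tilde{\mathbb{P}}^{[0,t]}}(\omega)
= \frac{\mu(\omega_0)}{\tilde{\mu}(\omega_0)}
\exp\left( \int_0^t \log \frac{c(\omega_s)}{\tilde{c}(\omega_s)}\, dN_s(\omega)
- \int_0^t \left( c(\omega_s) - \tilde{c}(\omega_s) \right) ds \right)
\tag{3.3}
\]

where $N_s(\omega)$ is the number of jumps of the path $\omega$ up to time $s$. The relative entropy
of $\mathbb{P}$ w.r.t. $\tilde{\mathbb{P}}$ up to time $t$ is defined as

\[
s_t(\mathbb{P}|\tilde{\mathbb{P}})
= \int d\mathbb{P}(\omega)\, \log\left( \frac{d\mathbb{P}^{[0,t]}}{d\tilde{\mathbb{P}}^{[0,t]}}(\omega) \right).
\tag{3.4}
\]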


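Using (3.3) and stationarity, we obtain

\[
\lim_{t \to \infty} \frac{s_t(\mathbb{P}|\tilde{\mathbb{P}})}{t}
= \sum_{x \in A} \mu(x) c(x) \log \frac{c(x)}{\tilde{c}(x)}
- \sum_{x \in A} \mu(x) \left( c(x) - \tilde{c}(x) \right)
=: s(\mathbb{P}|\tilde{\mathbb{P}})
\tag{3.5}
\]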
where $s(\mathbb{P}|\tilde{\mathbb{P}})$ is the relative entropy (per unit time) of $\mathbb{P}$ with respect to $\tilde{\mathbb{P}}$. We refer
to s, cn for more details on relative entropy for continuous-time Markov chains.
Notice also that, by ergodicity,

\[
\lim_{t \to \infty} \frac{1}{t} \log \frac{d\mathbb{P}^{[0,t]}}{d\tilde{\mathbb{P}}^{[0,t]}}(\omega)
= s(\mathbb{P}|\tilde{\mathbb{P}}) \quad \mathbb{P}\text{-a.s.}
\]
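
As a small illustration of formula (3.5), the following sketch evaluates the relative entropy rate from a stationary law and two sets of escape rates. The function name and the numbers are hypothetical; `mu` is simply assumed to be the stationary law of the first chain.

```python
import numpy as np

def relative_entropy_rate(mu, c, c_tilde):
    """s(P|P~) = sum_x mu(x) [ c(x) log(c(x)/c~(x)) - (c(x) - c~(x)) ], as in (3.5)."""
    mu, c, c_tilde = map(np.asarray, (mu, c, c_tilde))
    return float(np.sum(mu * (c * np.log(c / c_tilde) - (c - c_tilde))))

# Hypothetical numbers; mu is assumed to be stationary for the X chain.
mu = np.array([0.5, 0.3, 0.2])
c = np.array([1.0, 2.0, 3.0])
print(relative_entropy_rate(mu, c, c))        # identical rates: the rate vanishes
print(relative_entropy_rate(mu, c, 2.0 * c))  # different rates: a positive rate
```

Each summand has the form $a \log(a/b) - a + b$ with $a, b > 0$, which is non-negative, so the rate is non-negative and vanishes only when $c = \tilde{c}$ on the support of $\mu$.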

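In the case $\{Y_t : t \geq 0\}$ is a Markov chain with generator

\[ \tilde{L}f(x) = \sum_{y \in A} \tilde{c}(x)\, \tilde{p}(x,y)\, (f(y) - f(x)) \]

(3.5) generalizes to

\[
s(\mathbb{P}|\tilde{\mathbb{P}})
= \sum_{x,y \in A} \mu(x) c(x) p(x,y) \log \frac{c(x) p(x,y)}{\tilde{c}(x) \tilde{p}(x,y)}
- \sum_{x \in A} \mu(x) \left( c(x) - \tilde{c}(x) \right).
\tag{3.6}
\]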


An important particular case is met when $\{Y_t : t \geq 0\}$ is the time-reversed process of
$\{X_t : t \geq 0\}$, i.e.,

\[ (Y_t)_{0 \leq t \leq T} = (X_{T-t})_{0 \leq t \leq T} \quad \text{in distribution}. \]

This is a Markov chain with transition rates

\[
\tilde{c}(x,y) = c(x)\, \frac{c(y)\, p(y,x)\, \mu(y)}{c(x)\, \mu(x)}.
\tag{3.7}
\]

In that particular situation, the random variable

\[
S_T(\omega) = \log \frac{d\mathbb{P}^{[0,T]}}{d\tilde{\mathbb{P}}^{[0,T]}}(\omega)
\tag{3.8}
\]

has the interpretation of “entropy production”, and the relative entropy density $s(\mathbb{P}|\tilde{\mathbb{P}})$
has the interpretation of “mean entropy production per unit time”, see e.g. pm.
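
To make the time-reversal case concrete, here is a minimal sketch that builds the reversed rates from (3.7) and plugs them into (3.6). The function names and the two example rate matrices are hypothetical, and the code assumes that every allowed jump is also allowed in the opposite direction, so that all logarithms are finite.

```python
import numpy as np

def stationary_distribution(q):
    """Stationary law of the chain with jump rates q[x, y] (zero diagonal)."""
    n = q.shape[0]
    L = q - np.diag(q.sum(axis=1))
    A = np.vstack([L.T, np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

def entropy_production_rate(q):
    """Relative entropy rate (3.6) of the chain with respect to its time reversal.

    q[x, y] plays the role of c(x) p(x, y).
    """
    mu = stationary_distribution(q)
    # Reversed rates as in (3.7): q~(x, y) = mu(y) q(y, x) / mu(x).
    q_rev = (q.T * mu[None, :]) / mu[:, None]
    mask = q > 0
    ratio = np.ones_like(q)
    ratio[mask] = q[mask] / q_rev[mask]
    jump_term = np.sum(mu[:, None] * q * np.log(ratio))
    rate_term = np.sum(mu * (q.sum(axis=1) - q_rev.sum(axis=1)))
    return float(jump_term - rate_term)

# Symmetric rates satisfy detailed balance: no entropy production.
q_sym = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
# A cycle that is faster clockwise than counter-clockwise breaks detailed balance.
q_cyc = np.array([[0.0, 2.0, 1.0], [1.0, 0.0, 2.0], [2.0, 1.0, 0.0]])
print(entropy_production_rate(q_sym), entropy_production_rate(q_cyc))
```

In this sketch the rate vanishes for the symmetric example and is strictly positive for the biased cycle, in line with the interpretation of $s(\mathbb{P}|\tilde{\mathbb{P}})$ as mean entropy production per unit time.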


3.2 Law of large numbers 

For $\delta > 0$, we define the discrete-time Markov chain $X^\delta := \{X_0, X_\delta, X_{2\delta}, \ldots\}$. This
Markov chain has transition probabilities

\[
p_\delta^X(x,y) = (e^{\delta L})_{xy}
= I_{xy}(1 - \delta c(x)) + \delta c(x) p(x,y) + O(\delta^2)
\tag{3.9}
\]

where $I$ is the identity matrix. Similarly we define another Markov chain $Y^\delta$ with
transition probabilities

\[
p_\delta^Y(x,y) = (e^{\delta \tilde{L}})_{xy}
= I_{xy}(1 - \delta \tilde{c}(x)) + \delta \tilde{c}(x) p(x,y) + O(\delta^2).
\tag{3.10}
\]

The path-space measure (on $A^{\mathbb{N}}$) of $X^\delta$, resp. $Y^\delta$, is denoted by $\mathbb{P}^\delta$, resp. $\tilde{\mathbb{P}}^\delta$. From
now on, we will write $\mathbb{P}^\delta(X_1^n)$ instead of $\mathbb{P}(X_\delta, X_{2\delta}, \ldots, X_{n\delta})$ to alleviate notation.

We define waiting times, which are random variables defined on $A^{\mathbb{N}} \times A^{\mathbb{N}}$, by
setting

\[
W_n^\delta(X|Y) = \inf\{k \geq 1 : (X_1^\delta, \ldots, X_n^\delta) = (Y_{k+1}^\delta, \ldots, Y_{k+n}^\delta)\}
\tag{3.11}
\]

where we make the convention $\inf \emptyset = \infty$. In words, this is the first time that one
observes, in a realization of the process $Y^\delta$, the first $n$ symbols of a realization of
the process $X^\delta$. Similarly, if $X'^\delta$ is an independent copy of the process $X^\delta$, we define

\[
W_n^\delta(X|X') = \inf\{k \geq 1 : (X_1^\delta, \ldots, X_n^\delta) = (X'^\delta_{k+1}, \ldots, X'^\delta_{k+n})\}.
\tag{3.12}
\]
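
The waiting times can be computed directly from finite samples. The sketch below is only illustrative: the function names are hypothetical, sequences are plain Python sequences whose element 0 plays the role of the symbol with index 1, and `None` stands in for the convention $\inf \emptyset = \infty$ when the word is not seen in the available sample.

```python
import math

def waiting_time(x, y, n):
    """First k >= 1 with (x_1..x_n) == (y_{k+1}..y_{k+n}); None if not seen."""
    word = list(x[:n])
    for k in range(1, len(y) - n + 1):
        if list(y[k:k + n]) == word:
            return k
    return None

def log_ratio_estimate(x, y, x_prime, n, delta):
    """(1 / (n * delta)) * log( W_n(X|Y) / W_n(X|X') ), or None if a word is never seen."""
    w_xy = waiting_time(x, y, n)
    w_xx = waiting_time(x, x_prime, n)
    if w_xy is None or w_xx is None:
        return None
    return math.log(w_xy / w_xx) / (n * delta)

# Toy usage with hypothetical symbol strings.
print(waiting_time("ab", "abxab", 2))
print(log_ratio_estimate("ab", "ccab", "cab", 2, 0.5))
```

In practice the sample of $Y^\delta$ and $X'^\delta$ must be long enough for the word to appear, since waiting times typically grow exponentially in $n$.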


We then have the following law of large numbers. 
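Theorem 1. $(\mathbb{P} \otimes \tilde{\mathbb{P}} \otimes \mathbb{P})$-almost surely:

\[
\lim_{\delta \to 0} \lim_{n \to \infty} \frac{1}{n\delta} \log \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')}
= s(\mathbb{P}|\tilde{\mathbb{P}}).
\tag{3.14}
\]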


Before proving this theorem, we state a theorem about the exponential
approximation for the hitting-time law, which will be the crucial ingredient throughout this
paper. For an $n$-block $x_1^n := x_1, \ldots, x_n \in A^n$ and a discrete-time trajectory $\omega \in A^{\mathbb{N}}$,
we define the hitting time

\[
T^\delta_{x_1^n}(\omega) = \inf\{k \geq 1 : \omega_{k+1} = x_1, \ldots, \omega_{k+n} = x_n\}.
\tag{3.15}
\]

We then have the following result, see [Tj. 

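Theorem 2. For all $\delta > 0$, there exist $\eta_1, \eta_2, C, c, \beta, \kappa \in\, ]0, \infty[$ such that for all $n \in \mathbb{N}$
and for all $x_1^n \in A^n$, there exists $\eta = \eta(x_1^n)$, with $0 < \eta_1 \leq \eta \leq \eta_2 < \infty$, such that for
all $t > 0$

\[
\left| \mathbb{P}^\delta\left( T^\delta_{x_1^n} > \frac{t}{\mathbb{P}^\delta(X_1^n = x_1^n)} \right) - e^{-\eta t} \right|
\leq C e^{-ct} \left( \mathbb{P}^\delta(X_1^n = x_1^n) \right)^\kappa
\leq C e^{-ct} e^{-\beta n}.
\tag{3.17}
\]

The same theorem holds with $\mathbb{P}$ replaced by $\tilde{\mathbb{P}}$.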


The constants appearing in Theorem 2 (except $C$) depend on $\delta$, and more precisely
we have $\beta = \beta(\delta) \to 0$, $\eta_1 = \eta_1(\delta) \to 0$ as $\delta \to 0$.

This is important in applications, since one wants to choose a certain discretization
$\delta$ and then a corresponding “word-length” $n(\delta)$ for the waiting times, or vice-versa.
From Theorem 2 we derive (see [2]):

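Proposition 1. For all $\delta > 0$, there exist $\kappa_1, \kappa_2 > 0$ such that

\[
-\kappa_1 \log n \leq \log\left( W_n^\delta(X|Y)\, \tilde{\mathbb{P}}^\delta(X_1^n) \right) \leq \log(\log n^{\kappa_2})
\quad \mathbb{P} \otimes \tilde{\mathbb{P}} \text{ eventually a.s.}
\]

and

\[
-\kappa_1 \log n \leq \log\left( W_n^\delta(X|X')\, \mathbb{P}^\delta(X_1^n) \right) \leq \log(\log n^{\kappa_2})
\quad \mathbb{P} \otimes \mathbb{P} \text{ eventually a.s.}
\]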
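
With these ingredients we can now give the proof of Theorem 1.

Proof of Theorem 1. From Proposition 1 it follows that, for all $\delta > 0$, $\mathbb{P} \otimes \tilde{\mathbb{P}} \otimes \mathbb{P}$
almost surely

\[
\lim_{n \to \infty} \frac{1}{n} \left( \log W_n^\delta(X|Y) - \log W_n^\delta(X|X')
+ \sum_{i=0}^{n-1} \log p_\delta^Y(X_i, X_{i+1}) - \sum_{i=0}^{n-1} \log p_\delta^X(X_i, X_{i+1}) \right) = 0.
\tag{3.19}
\]

By ergodicity of the continuous-time Markov chain $\{X_t : t \geq 0\}$, the discrete Markov
chains $X^\delta$, $Y^\delta$ are also ergodic and therefore we obtain

\[
\lim_{n \to \infty} \frac{1}{n} \left( \log W_n^\delta(X|Y) - \log W_n^\delta(X|X') \right)
+ \sum_{x,y \in A} \mu(x)\, p_\delta^X(x,y) \log\left( \frac{p_\delta^Y(x,y)}{p_\delta^X(x,y)} \right) = 0.
\tag{3.20}
\]

Using (3.9), (3.10) and $p(x,x) = 0$, this gives

\[
\begin{aligned}
\lim_{n \to \infty} \frac{1}{n} & \left( \log W_n^\delta(X|Y) - \log W_n^\delta(X|X') \right) \\
&= -\sum_{x \in A} \mu(x)(1 - \delta c(x)) \log(1 - \delta \tilde{c}(x))
 - \sum_{x,y \in A} \mu(x)\, \delta c(x) p(x,y) \log(\delta \tilde{c}(x) p(x,y)) \\
&\quad + \sum_{x \in A} \mu(x)(1 - \delta c(x)) \log(1 - \delta c(x))
 + \sum_{x,y \in A} \mu(x)\, \delta c(x) p(x,y) \log(\delta c(x) p(x,y)) + O(\delta^2) \\
&= \delta \left( \sum_{x,y \in A} \mu(x) c(x) p(x,y) \log \frac{c(x)}{\tilde{c}(x)}
 + \sum_{x \in A} \mu(x) \left( \tilde{c}(x) - c(x) \right) \right) + O(\delta^2) \\
&= \delta\, s(\mathbb{P}|\tilde{\mathbb{P}}) + O(\delta^2).
\end{aligned}
\tag{3.21}
\]

Combining this with (3.5) concludes the proof of Theorem 1. □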
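
The first-order expansion in the last display can be checked numerically. The sketch below is an illustration under stated assumptions: a hypothetical three-state example, and a truncated Taylor series standing in for the matrix exponential (adequate here because $\delta L$ has small norm). It compares the exact per-step relative entropy of the two discretized chains, divided by $\delta$, with $s(\mathbb{P}|\tilde{\mathbb{P}})$ from (3.5).

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential by a truncated Taylor series (for matrices of small norm)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def stationary_distribution(L):
    """Solve mu L = 0 with sum(mu) = 1."""
    n = L.shape[0]
    A = np.vstack([L.T, np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

def per_step_relative_entropy(delta, c, c_tilde, p):
    """sum_{x,y} mu(x) p_delta^X(x,y) log( p_delta^X(x,y) / p_delta^Y(x,y) )."""
    n = len(c)
    L = np.diag(c) @ (p - np.eye(n))
    L_tilde = np.diag(c_tilde) @ (p - np.eye(n))
    mu = stationary_distribution(L)
    PX = expm_taylor(delta * L)
    PY = expm_taylor(delta * L_tilde)
    return float(np.sum(mu[:, None] * PX * np.log(PX / PY))), mu

# Hypothetical example with a fully connected jump matrix (all entries of PX, PY positive).
c = np.array([1.0, 2.0, 3.0])
c_tilde = np.array([2.0, 1.0, 1.5])
p = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
delta = 1e-4
d_step, mu = per_step_relative_entropy(delta, c, c_tilde, p)
s_rate = float(np.sum(mu * (c * np.log(c / c_tilde) - (c - c_tilde))))
print(d_step / delta, s_rate)  # the two numbers should agree up to O(delta)
```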

Let us now specify the dependence on $\delta$ of the various constants appearing in
Theorem 2. For the lower bound on the parameter we have (see [1], section 5)

\[
\eta_1(\delta) \geq \frac{1}{C + K}
\tag{3.22}
\]

where $C$ is a positive number independent of $\delta$ and

\[
K = 2 \sum_{l \geq 1} \alpha(l) + \sum_{k=1}^{n/2} \sup_{\{x_1^{n-k}\}} \mathbb{P}^\delta\left( X_1^{n-k} = x_1^{n-k} \right).
\]

Here $\alpha(l)$ denotes the classical $\alpha$-mixing coefficient:

\[
\alpha(l) = \sup_{m \geq 1}\ \sup_{S_1 \in \mathcal{F}_0^m,\ S_2 \in \mathcal{F}_{m+l}^\infty}
\left| \mathbb{P}^\delta(S_1 \cap S_2) - \mathbb{P}^\delta(S_1)\, \mathbb{P}^\delta(S_2) \right|
\]

where $\mathcal{F}_m^n$ is the Borel sigma-field on $A^{\mathbb{N}}$ generated by $X_m^n$ ($0 \leq m \leq n \leq \infty$). By the
assumption of ergodicity of the continuous Markov chain, the generator $L$ (resp. $\tilde{L}$)
has an eigenvalue 0, the largest real part of the other eigenvalues is strictly negative
and denoted by $-\lambda_1 < 0$, and one has

\[ \alpha(l) \leq \exp(-\lambda_1 \delta l). \tag{3.23} \]

Using (3.9) there exists $\lambda_2 > 0$ such that

\[ \mathbb{P}^\delta\left( X_1^{n-k} = x_1^{n-k} \right) \leq \exp(-\lambda_2 \delta n / 2) \]

for $k = 1, \ldots, n/2$. Therefore, there exists $c > 0$ such that

\[ \eta_1(\delta) > c\, \delta. \tag{3.24} \]

Similarly, from the proof of Theorem 2.1 in [2] one easily obtains the dependence on
$\delta$ of the constants appearing in the error term of (3.17):

\[ c = c(\delta) > \gamma_1 \delta, \qquad \beta = \beta(\delta) > \gamma_2 \delta \tag{3.25} \]

for some $\gamma_1, \gamma_2 > 0$.

In applications, e.g., the estimation of the relative entropy from a sample path, one
would like to choose the word-length $n$ and the discretization $\delta = \delta_n$ together. This
possibility is precisely provided by the estimates (3.24) and (3.25), as the following
analogue of Proposition 1 shows.


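Proposition 2. Let $\delta_n \to 0$ as $n \to \infty$, then there exist $\kappa_1, \kappa_2 > 0$ such that

\[
-\frac{\kappa_1 \log n}{\delta_n}
\leq \log\left( W_n^{\delta_n}(X|Y)\, \tilde{\mathbb{P}}^{\delta_n}(X_1^n) \right)
\leq \log\left( \frac{\kappa_2 \log n}{\delta_n} \right)
\quad \mathbb{P} \otimes \tilde{\mathbb{P}} \text{ eventually a.s.}
\tag{3.26}
\]

and

\[
-\frac{\kappa_1 \log n}{\delta_n}
\leq \log\left( W_n^{\delta_n}(X|X')\, \mathbb{P}^{\delta_n}(X_1^n) \right)
\leq \log\left( \frac{\kappa_2 \log n}{\delta_n} \right)
\quad \mathbb{P} \otimes \mathbb{P} \text{ eventually a.s.}
\tag{3.27}
\]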


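Proof. The proof is analogous to the proof of Theorem 2.4 in [2]. For the sake of
completeness, we prove the upper bound (3.27). We can assume that $\delta_n \leq 1$. By the
exponential approximation (3.17) we have, for all $t > 0$, $n \geq 1$, the estimates

\[
\begin{aligned}
\mathbb{P}^{\delta_n} \otimes \tilde{\mathbb{P}}^{\delta_n}\left( \log\left( W_n^{\delta_n}(X|Y)\, \tilde{\mathbb{P}}^{\delta_n}(X_1^n) \right) \geq \log t \right)
&\leq e^{-\eta(\delta_n) t} + C e^{-\beta(\delta_n) n} e^{-c(\delta_n) t} \\
&\leq e^{-c\, \delta_n t} + C e^{-\gamma_2 \delta_n n} e^{-\gamma_1 \delta_n t},
\end{aligned}
\tag{3.28}
\]

where the second line uses (3.24) and (3.25). Choosing $t = t_n = \kappa_2 \log n / \delta_n$, with $\kappa_2 > 0$ large enough, makes the rhs of (3.28) summable
and hence a Borel-Cantelli argument gives the upper bound. □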


Of course, whether this proposition is still useful, i.e., whether it still gives the
law of large numbers with $\delta = \delta_n$, depends on the behavior of ergodic sums

\[ \sum_{i=1}^{n} f(X_i) \]

under the measure $\mathbb{P}^{\delta_n}$, i.e., the behavior of

\[ \sum_{i=1}^{n} f(X_{i\delta_n}) \]

under $\mathbb{P}$. This is made precise in the following theorem:

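Theorem 3. Suppose that $\delta_n \to 0$ as $n \to \infty$ such that $\frac{\log n}{n \delta_n^2} \to 0$; then in $(\mathbb{P} \otimes \tilde{\mathbb{P}} \otimes \mathbb{P})$
probability:

\[
\lim_{n \to \infty} \frac{1}{n \delta_n} \log \frac{W_n^{\delta_n}(X|Y)}{W_n^{\delta_n}(X|X')} = s(\mathbb{P}|\tilde{\mathbb{P}}).
\tag{3.30}
\]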

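Proof. By Proposition 2 we can write

\[
\begin{aligned}
\log W_n^{\delta_n}(X|Y) & - \log W_n^{\delta_n}(X|X') \\
&= \sum_{i=1}^{n} \left( I(X_i = X_{i+1}) \log \frac{1 - \delta_n c(X_i)}{1 - \delta_n \tilde{c}(X_i)}
+ I(X_i \neq X_{i+1}) \log \frac{c(X_i)}{\tilde{c}(X_i)} \right) + O(\log n / \delta_n).
\end{aligned}
\tag{3.31}
\]

The sum on the right hand side of (3.31) is of the form

\[ \sum_{i=1}^{n} F_{\delta_n}\left( X_{i\delta_n}, X_{(i+1)\delta_n} \right) \tag{3.32} \]

with

\[ \mathbb{E}\left( F_{\delta_n} - \mathbb{E}(F_{\delta_n}) \right)^2 \leq C \delta_n \tag{3.33} \]

where $C > 0$ is some constant. Now, using ergodicity of the continuous-time Markov
chain $\{X_t, t \geq 0\}$, we have the estimate

\[
\mathbb{E}\left( \left( F_{\delta_n}(X_{i\delta_n}, X_{(i+1)\delta_n}) - \mathbb{E}(F_{\delta_n}) \right)
\left( F_{\delta_n}(X_{j\delta_n}, X_{(j+1)\delta_n}) - \mathbb{E}(F_{\delta_n}) \right) \right)
\leq \|F_{\delta_n}\|_2^2\, e^{-\delta_n \lambda_1 |i-j|}
\tag{3.34}
\]

with $\lambda_1 > 0$ independent of $n$.

Combining these estimates gives

\[
\mathrm{Var}\left( \sum_{i=1}^{n} F_{\delta_n}(X_{i\delta_n}, X_{(i+1)\delta_n}) \right)
\leq C n \delta_n + \sum_{i=1}^{n} \sum_{j \in \{1,\ldots,n\} \setminus \{i\}} \delta_n\, e^{-\delta_n \lambda_1 |i-j|}
\leq C n \delta_n + C' n
\tag{3.35}
\]

where $C' > 0$ is some constant. Therefore,

\[
\frac{1}{n^2 \delta_n^2} \mathrm{Var}\left( \sum_{i=1}^{n} F_{\delta_n}(X_{i\delta_n}, X_{(i+1)\delta_n}) \right) = O(1 / n \delta_n^2).
\tag{3.36}
\]

Combining (3.31) and (3.36) with the assumption $\frac{\log n}{n \delta_n^2} \to 0$ concludes the proof. □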

3.3 Large deviations

In this subsection, we study the large deviations of

\[ \frac{1}{n} \log\left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right). \]

More precisely, we compute the large deviation generating function $\mathcal{F}^\delta(p)$ in the limit
$\delta \to 0$ and show that it coincides with the large deviation generating function for the
Radon-Nikodym derivatives $d\mathbb{P}^{[0,t]} / d\tilde{\mathbb{P}}^{[0,t]}$. As in the case of waiting times for
discrete-time processes, see e.g. [3], the scaled-cumulant generating function is only finite in
the interval $(-1, 1)$.

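For the sake of convenience we introduce the function

\[
\begin{aligned}
\mathcal{E}(p) := \lim_{\delta \to 0} \frac{1}{\delta} \mathcal{E}^\delta(p)
&= \lim_{t \to \infty} \frac{1}{t} \log \mathbb{E}_{\mathbb{P}} \left( \frac{d\mathbb{P}^{[0,t]}}{d\tilde{\mathbb{P}}^{[0,t]}} \right)^p \\
&= \lim_{t \to \infty} \frac{1}{t} \log \mathbb{E}_{\mathbb{P}} \exp\left( p \left( \int_0^t \log \frac{c(\omega_s)}{\tilde{c}(\omega_s)}\, dN_s(\omega)
- \int_0^t \left( c(\omega_s) - \tilde{c}(\omega_s) \right) ds \right) \right).
\end{aligned}
\tag{3.37}
\]

By standard large deviation theory for continuous-time Markov chains (see e.g. EH)
this function exists and is the scaled-cumulant generating function for the large
deviations of

\[
\int_0^t \log \frac{c(\omega_s)}{\tilde{c}(\omega_s)}\, dN_s(\omega) - \int_0^t \left( c(\omega_s) - \tilde{c}(\omega_s) \right) ds
\]

as $t \to \infty$.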

We can now formulate the following large deviation theorem. 


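Theorem 4. For all $p \in \mathbb{R}$ and $\delta > 0$ the function

\[
\mathcal{F}^\delta(p) := \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_{\mathbb{P}^\delta \otimes \tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta}
\left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right)^p
\tag{3.38}
\]

exists, is finite in $p \in (-1, 1)$ whereas

\[ \mathcal{F}^\delta(p) = \infty \quad \text{for } |p| > 1. \tag{3.39} \]

Moreover, as $\delta \to 0$, we have, for all $p \in (-1, 1)$:

\[ \mathcal{F}(p) := \lim_{\delta \to 0} \frac{1}{\delta} \mathcal{F}^\delta(p) = \mathcal{E}(p). \]

The following notion of logarithmic equivalence will be convenient later on.

Definition 1. Two non-negative sequences $a_n$, $b_n$ are called logarithmically equivalent
(notation $a_n \simeq b_n$) if

\[ \lim_{n \to \infty} \frac{1}{n} (\log a_n - \log b_n) = 0. \]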
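
Proof. To prove Theorem 4 we start with the following lemma.

Lemma 1. 1. For all $\delta > 0$ and for $|p| < 1$,

\[
\mathbb{E}_{\mathbb{P}^\delta \otimes \tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right)^p
\simeq \mathbb{E}_{\mathbb{P}^\delta} \exp\left( p \sum_{i=0}^{n-1} \log \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})} \right).
\tag{3.42}
\]

2. For $|p| > 1$,

\[
\lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_{\mathbb{P}^\delta \otimes \tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right)^p = \infty.
\tag{3.43}
\]

Proof. The proof is similar to that of Theorem 3 in [3].

\[
\begin{aligned}
\mathbb{E}_{\mathbb{P}^\delta \otimes \tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right)^p
&= \sum_{x_1, \ldots, x_n} \mathbb{P}^\delta(X_1^n = x_1^n)\,
\mathbb{E}_{\tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{T_{x_1^n}(Y^\delta)}{T_{x_1^n}(X'^\delta)} \right)^p \\
&= \sum_{x_1, \ldots, x_n} \mathbb{P}^\delta(X_1^n = x_1^n)^{1+p}\, \tilde{\mathbb{P}}^\delta(Y_1^n = x_1^n)^{-p}\,
\mathbb{E}_{\tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{\xi_n}{\zeta_n} \right)^p
\end{aligned}
\]

where

\[ \xi_n = T_{x_1^n}(Y^\delta)\, \tilde{\mathbb{P}}^\delta(Y_1^n = x_1^n) \]

and

\[ \zeta_n = T_{x_1^n}(X'^\delta)\, \mathbb{P}^\delta(X'^n_1 = x_1^n). \]

The random variables $\xi_n$, $\zeta_n$ have approximately an exponential distribution (in the
sense of Theorem 2) and are independent. Using this, we can repeat the arguments
of the proof of Theorem 3 in [3], which uses the exponential law with the error-bound
given by Theorem 2, to prove that for $p \in (-1, 1)$

\[ 0 < C_1 \leq \mathbb{E}_{\tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{\xi_n}{\zeta_n} \right)^p \leq C_2 < \infty \]

where $C_1$, $C_2$ do not depend on $n$, whereas for $|p| > 1$,

\[ \mathbb{E}_{\tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{\xi_n}{\zeta_n} \right)^p = \infty. \tag{3.44} \]

Therefore, with the notation of Definition 1, for $|p| < 1$

\[
\begin{aligned}
\mathbb{E}_{\mathbb{P}^\delta \otimes \tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta} \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right)^p
&\simeq \sum_{x_1, \ldots, x_n} \mathbb{P}^\delta(X_1 = x_1, \ldots, X_n = x_n)^{1+p}\, \tilde{\mathbb{P}}^\delta(Y_1 = x_1, \ldots, Y_n = x_n)^{-p} \\
&= \mathbb{E}_{\mathbb{P}^\delta} \exp\left( p \sum_{i=1}^{n} \log \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})} \right)
\end{aligned}
\tag{3.45}
\]

and for $|p| > 1$ we obtain (3.43) from (3.44). □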


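This proves the existence of $\mathcal{F}^\delta(p)$ for $|p| < 1$. Indeed, by Lemma 1 it coincides with the limit

\[
\mathcal{E}^\delta(p) = \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_{\mathbb{P}^\delta} \exp\left( p \sum_{i=1}^{n} \log \left( \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})} \right) \right),
\tag{3.46}
\]

which exists by standard large deviation theory of (discrete-time, finite state space) Markov
chains (since $\delta > 0$ is fixed).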
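
For a finite-state chain with strictly positive transition matrix, a limit of this type can be evaluated as the logarithm of the Perron eigenvalue of an entrywise tilted matrix. The sketch below relies on that standard representation, which the text does not spell out, and on a tilted generator obtained from a first-order expansion; both are assumptions of this illustration, as are the function names and the three-state example.

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential by a truncated Taylor series (for matrices of small norm)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def scgf_discrete(p_exp, PX, PY):
    """log of the Perron eigenvalue of the matrix PX^(1+p) * PY^(-p) (entrywise powers)."""
    tilted = PX ** (1.0 + p_exp) * PY ** (-p_exp)
    return float(np.log(np.max(np.abs(np.linalg.eigvals(tilted)))))

def scgf_continuous(p_exp, c, c_tilde, p):
    """Largest real part among eigenvalues of the tilted generator (assumed form).

    Off-diagonal entries: c(x) p(x,y) (c(x)/c~(x))^p; diagonal: -c(x) - p (c(x) - c~(x)).
    """
    G = (c * (c / c_tilde) ** p_exp)[:, None] * p
    G = G - np.diag(c + p_exp * (c - c_tilde))
    return float(np.max(np.linalg.eigvals(G).real))

# Hypothetical three-state example with a fully connected jump matrix.
c = np.array([1.0, 2.0, 3.0])
c_tilde = np.array([2.0, 1.0, 1.5])
p = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
delta = 1e-4
PX = expm_taylor(delta * np.diag(c) @ (p - np.eye(3)))
PY = expm_taylor(delta * np.diag(c_tilde) @ (p - np.eye(3)))
for p_exp in (-0.5, 0.0, 0.5):
    print(p_exp, scgf_discrete(p_exp, PX, PY) / delta, scgf_continuous(p_exp, c, c_tilde, p))
```

In this example the rescaled discrete value and the continuous-time value agree up to $O(\delta)$, which is the behavior the expansion that follows is meant to establish.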

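In order to deal with the limit $\delta \to 0$ of $\mathcal{E}^\delta(p)$, we expand the expression in the
rhs of (3.46) up to order $\delta^2$. This gives

\[
\begin{aligned}
\mathbb{E}_{\mathbb{P}^\delta} & \exp\left( p \sum_{i=1}^{n} \log \left( \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})} \right) \right) \\
&= \mathbb{E}_{\mathbb{P}^\delta} \exp\left( p \sum_{i=1}^{n} \log
\frac{I_{X_i, X_{i+1}}(1 - \delta c(X_i)) + \delta c(X_i) p(X_i, X_{i+1}) + O(\delta^2)}
{I_{X_i, X_{i+1}}(1 - \delta \tilde{c}(X_i)) + \delta \tilde{c}(X_i) p(X_i, X_{i+1}) + O(\delta^2)} \right) \\
&= e^{O(n\delta^2)}\, \mathbb{E}_{\mathbb{P}^\delta} \exp\left( p \sum_{i=1}^{n} \delta\, I(X_i = X_{i+1}) \left( \tilde{c}(X_i) - c(X_i) \right)
+ p \sum_{i=1}^{n} I(X_i \neq X_{i+1}) \log \frac{c(X_i)}{\tilde{c}(X_i)} \right) \\
&= e^{O(n\delta^2)}\, \mathbb{E}_{\mathbb{P}} \exp\left( p \sum_{i=1}^{n} \delta\, I(X_{i\delta} = X_{(i+1)\delta}) \left( \tilde{c}(X_{i\delta}) - c(X_{i\delta}) \right)
+ p \sum_{i=1}^{n} I(X_{i\delta} \neq X_{(i+1)\delta}) \log \frac{c(X_{i\delta})}{\tilde{c}(X_{i\delta})} \right).
\end{aligned}
\]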

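Next we prove that for all $K \in \mathbb{R}$

\[
\log \frac{
\mathbb{E}_{\mathbb{P}} \exp\left( K \sum_{i=1}^{n} \left( \delta\, I(X_{i\delta} = X_{(i+1)\delta}) \left( \tilde{c}(X_{i\delta}) - c(X_{i\delta}) \right)
+ I(X_{i\delta} \neq X_{(i+1)\delta}) \log \frac{c(X_{i\delta})}{\tilde{c}(X_{i\delta})} \right) \right)
}{
\mathbb{E}_{\mathbb{P}} \exp\left( K \int_0^{n\delta} \left( \tilde{c}(X_s) - c(X_s) \right) ds
+ K \int_0^{n\delta} \log \frac{c(X_s)}{\tilde{c}(X_s)}\, dN_s \right)
} = O(n\delta^2).
\tag{3.47}
\]

This implies the result of the theorem by a standard application of Hölder’s
inequality, see e.g. [5]. We first consider the difference

\[
A(n, \delta) := \left| \sum_{i=1}^{n} I(X_{i\delta} \neq X_{(i+1)\delta}) \log \frac{c(X_{i\delta})}{\tilde{c}(X_{i\delta})}
- \int_0^{n\delta} \log \frac{c(X_s)}{\tilde{c}(X_s)}\, dN_s \right|.
\]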

If there does not exist an interval $[i\delta, (i+1)\delta[$, $i \in \{0, \ldots, n-1\}$, where at least
two jumps of the Poisson process $\{N_t, t \geq 0\}$ occur, then $A(n, \delta) = 0$. Indeed, if there
is no jump in $[i\delta, (i+1)\delta[$, both $I(X_{i\delta} \neq X_{(i+1)\delta}) \log \frac{c(X_{i\delta})}{\tilde{c}(X_{i\delta})}$ and $\int_{i\delta}^{(i+1)\delta} \log \frac{c(X_s)}{\tilde{c}(X_s)}\, dN_s$
are zero, and if there is precisely one jump, then they are equal. Therefore, using the
independent increment property of the Poisson process, and the strict positivity of
the rates, we have the bound

\[ A(n, \delta) \leq C \sum_{i=1}^{n} I(\chi_i \geq 2) \]

where the $\chi_i$'s, $i = 1, \ldots, n$, form a collection of independent Poisson random variables
with parameter $\delta$, and $C$ is some positive constant. This gives

\[
\mathbb{E}_{\mathbb{P}}\, e^{2K A(n,\delta)} = \left( O(\delta^2) e^{2K} + O(1) \right)^n = O(e^{n\delta^2}).
\tag{3.48}
\]
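
For the reader's convenience, the Poisson computation behind this bound can be spelled out as follows. This is a sketch for $K > 0$, with $\chi$ Poisson of parameter $\delta$ and $C$ the constant from the bound on $A(n, \delta)$; it is a worked version of the estimate rather than a quotation from the original argument.

```latex
\[
\mathbb{P}(\chi \geq 2) = 1 - e^{-\delta}(1 + \delta) \leq \frac{\delta^2}{2},
\]
\[
\mathbb{E}\, \exp\left( 2KC \sum_{i=1}^{n} I(\chi_i \geq 2) \right)
= \left( 1 + \left( e^{2KC} - 1 \right) \mathbb{P}(\chi \geq 2) \right)^n
\leq \exp\left( \frac{n \delta^2}{2} \left( e^{2KC} - 1 \right) \right).
\]
```

The first inequality follows because $\delta \mapsto \delta^2/2 - 1 + e^{-\delta}(1+\delta)$ vanishes at $0$ and has derivative $\delta(1 - e^{-\delta}) \geq 0$; the second uses independence of the $\chi_i$ and $1 + u \leq e^u$.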


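Next, we tackle

\[
B(n, \delta) := \left| \sum_{i=1}^{n} \delta\, I(X_{i\delta} = X_{(i+1)\delta}) \left( \tilde{c}(X_{i\delta}) - c(X_{i\delta}) \right)
- \int_0^{n\delta} \left( \tilde{c}(X_s) - c(X_s) \right) ds \right|.
\tag{3.49}
\]

If there is no jump in any of the intervals $[i\delta, (i+1)\delta[$, this term is zero. Therefore it
is bounded by

\[ B(n, \delta) \leq C' \delta \sum_{i=1}^{n} I(\chi_i \geq 1) \]

where the $\chi_i$'s, $i = 1, \ldots, n$, form once more a collection of independent Poisson
random variables with parameter $\delta$, and $C'$ is some positive constant. This gives

\[
\mathbb{E}_{\mathbb{P}}\, e^{2K B(n,\delta)} \leq \left( O(\delta e^{C''\delta}) + 1 - \delta \right)^n = O(e^{n\delta^2})
\]

where $C''$ is some positive constant. Hence, (3.47) follows by combining (3.48) and
(3.49) and using the Cauchy-Schwarz inequality. □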


The following proposition is a straightforward application of Theorem 4 and [8].


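Proposition 3. For all $\delta > 0$, $\mathcal{F}^\delta$ is real-analytic and convex, and the sequence
$\{\log W_n^\delta(X|Y) - \log W_n^\delta(X|X') : n \in \mathbb{N}\}$ satisfies the following large deviation
principle: Define the open interval $(c_-, c_+)$, with

\[ c_\pm := \lim_{p \to \pm 1} \frac{d\mathcal{F}^\delta}{dp}. \]

Then, for every interval $J$ such that $J \cap (c_-, c_+) \neq \emptyset$

\[
\lim_{n \to \infty} \frac{1}{n} \log \mathbb{P}^\delta \otimes \tilde{\mathbb{P}}^\delta \otimes \mathbb{P}^\delta
\left( \frac{1}{n} \log \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right) \in J \right)
= - \inf_{q \in J \cap (c_-, c_+)} \mathcal{I}^\delta(q)
\]

where $\mathcal{I}^\delta$ is the Legendre transform of $\mathcal{F}^\delta$.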

Remark 2. In the case $\{Y_t : t \geq 0\}$ is the time-reversed process of $\{X_t : t \geq 0\}$, the
cumulant generating function $\mathcal{E}(p)$ satisfies the so-called fluctuation theorem
symmetry

\[ \mathcal{E}(p) = \mathcal{E}(-1 - p). \]

The large deviation result of Theorem 4 then gives that the entropy production
estimated via waiting times of a discretized version of the process has the same symmetry
in its cumulant generating function for $p \in [0, 1]$.
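
The symmetry can be checked numerically on a small example. The sketch below assumes that the cumulant generating function of the Radon-Nikodym derivative is the eigenvalue of largest real part of a tilted generator with off-diagonal entries $q(x,y)^{1+p}\, \tilde{q}(x,y)^{-p}$, where $q(x,y) = c(x)p(x,y)$ and $\tilde{q}$ are the reversed rates; this representation, the function names and the rate matrix are assumptions of the illustration, not statements from the text.

```python
import numpy as np

def stationary_distribution(q):
    """Stationary law of the chain with jump rates q[x, y] (zero diagonal)."""
    n = q.shape[0]
    L = q - np.diag(q.sum(axis=1))
    A = np.vstack([L.T, np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

def scgf_time_reversal(p_exp, q):
    """Largest real eigenvalue part of the tilted generator for the reversed reference chain."""
    mu = stationary_distribution(q)
    q_rev = (q.T * mu[None, :]) / mu[:, None]   # reversed rates mu(y) q(y, x) / mu(x)
    G = np.zeros_like(q)
    mask = q > 0
    G[mask] = q[mask] ** (1.0 + p_exp) * q_rev[mask] ** (-p_exp)
    escape = q.sum(axis=1)
    escape_rev = q_rev.sum(axis=1)              # equals escape, by stationarity
    G -= np.diag(escape + p_exp * (escape - escape_rev))
    return float(np.max(np.linalg.eigvals(G).real))

# Hypothetical irreversible three-state chain, every jump allowed in both directions.
q = np.array([[0.0, 2.0, 1.0], [1.0, 0.0, 3.0], [2.0, 1.0, 0.0]])
for p_exp in (-0.1, -0.3, -0.5):
    print(p_exp, scgf_time_reversal(p_exp, q), scgf_time_reversal(-1.0 - p_exp, q))
```

In this example the two printed values coincide for each $p$, because the tilted generators at $p$ and $-1-p$ are related by transposition and conjugation with $\mathrm{diag}(\mu)$.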


3.4 Central limit theorem 


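Theorem 5. For all $\delta > 0$,

\[
\frac{1}{\sqrt{n}} \left( \log \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right) - ns \right)
\]

converges in distribution to a normal law $\mathcal{N}(0, \sigma_\delta^2)$, where

\[
\sigma_\delta^2 = \lim_{n \to \infty} \frac{1}{n} \mathrm{Var}\left( \log \left( \frac{\mathbb{P}^\delta(X_1^n)}{\tilde{\mathbb{P}}^\delta(X_1^n)} \right) \right).
\]

Moreover

\[ \lim_{\delta \to 0} \frac{1}{\delta} \sigma_\delta^2 = \theta^2 \]

where

\[
\theta^2 = \lim_{t \to \infty} \frac{1}{t} \mathrm{Var}\left( \log \left( \frac{d\mathbb{P}^{[0,t]}}{d\tilde{\mathbb{P}}^{[0,t]}} \right) \right).
\]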

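Proof. First we claim that for all $\delta > 0$

\[
\lim_{n \to \infty} \frac{1}{n} \mathbb{E} \left( \log \left( \frac{W_n^\delta(X|Y)}{W_n^\delta(X|X')} \right)
- \sum_{i=1}^{n} \log \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})} \right)^2 = 0.
\tag{3.53}
\]

This follows from the exponential law, as is shown in [3], proof of Theorem 2.

Equation (3.53) implies that a CLT for $\left( \log W_n^\delta(X|Y) - \log W_n^\delta(X|X') \right)$ is
equivalent to a CLT for $\sum_{i=1}^{n} \log \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})}$, and the variances of the asymptotic normals
are equal. For $\delta$ fixed, $\sum_{i=1}^{n} \log \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})}$ satisfies the CLT (for $\delta > 0$ fixed, $X^\delta$ is
a discrete-time ergodic Markov chain), so the only thing left is the claimed limiting
behavior for the variance, as $\delta \to 0$.

As in the proof of the large deviation theorem, we first develop up to order $\delta$:

\[
\begin{aligned}
\sum_{i=1}^{n} \log \frac{p_\delta^X(X_i, X_{i+1})}{p_\delta^Y(X_i, X_{i+1})}
&= \sum_{i=1}^{n} I(X_i = X_{i+1})\, \delta \left( \tilde{c}(X_i) - c(X_i) \right)
+ \sum_{i=1}^{n} I(X_i \neq X_{i+1}) \log \frac{c(X_i)}{\tilde{c}(X_i)} \\
&=: \sum_{i=1}^{n} \left( \xi_i^\delta + \zeta_i^\delta \right).
\end{aligned}
\]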

It is then sufficient to verify that


1 Jh , (( /■(*+ 1 ) 5 \ 2 / /■(*+ 1)5 r ( v \ \ 

SP”(c(V) - c(A',))d S J +^-/ a 

which is a computation with Poisson random variables analogous to the one used in
the proof of Theorem 4. □

4 Shadowing a given trajectory 

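Let $\gamma \in D([0, \infty), A)$ be a given trajectory. The jump process associated to $\gamma$ is
defined by

\[ N_t(\gamma) = \sum_{0 \leq s \leq t} I(\gamma_{s^-} \neq \gamma_{s^+}). \]

For a given $\delta > 0$, define the “jump times” of the $\delta$-discretization of $\gamma$:

\[ \Sigma_n^\delta(\gamma) = \{ i \in \{1, \ldots, n\} : \gamma_{(i-1)\delta} \neq \gamma_{i\delta} \}. \]

For the Markov process $\{X_t, t \geq 0\}$ with generator

\[ Lf(x) = \sum_{y \in A} c(x)\, p(x,y)\, (f(y) - f(x)) \]

define the hitting time

\[ T_n^\delta(\gamma|X) = \inf\{ k \geq 0 : (X_{k\delta}, \ldots, X_{(k+n)\delta}) = (\gamma_\delta, \ldots, \gamma_{n\delta}) \}. \]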
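
As an illustration of the $\delta$-discretization of a trajectory and of its discretized jump times, here is a minimal sketch. The representation of $\gamma$ by a sorted list of jump times and the states taken from each jump time on is a hypothetical encoding chosen for the example.

```python
from bisect import bisect_right

def sample_path(jump_times, states, delta, n):
    """Return (gamma_0, gamma_delta, ..., gamma_{n delta}) for a right-continuous step path.

    jump_times[0] must be 0.0; states[j] is the value of the path on
    [jump_times[j], jump_times[j + 1]).
    """
    return [states[bisect_right(jump_times, i * delta) - 1] for i in range(n + 1)]

def discretized_jump_indices(samples):
    """Indices i in {1, ..., n} with gamma_{(i-1) delta} != gamma_{i delta}."""
    return [i for i in range(1, len(samples)) if samples[i - 1] != samples[i]]

# Hypothetical path: a -> b at 0.35, b -> a at 0.4, a -> c at 1.2.
jump_times = [0.0, 0.35, 0.4, 1.2]
states = ["a", "b", "a", "c"]
samples = sample_path(jump_times, states, delta=0.5, n=3)
print(samples, discretized_jump_indices(samples))
```

With this coarse time-step the two jumps falling in $[0, 0.5[$ are invisible in the discretization, which is the kind of event that has to be controlled when the sums over discretized jump times are compared with integrals against $dN_s(\gamma)$.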


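In words this is the first time after which the $\delta$-discretization of the process imitates
the $\delta$-discretization of the given trajectory $\gamma$ during $n$ time-steps. For fixed $\delta > 0$,
the process $\{X_{n\delta}; n \in \mathbb{N}\}$ is an ergodic discrete-time Markov chain for which we can
apply the results of [1] for hitting times. More precisely there exist $0 < \Lambda_1 < \Lambda_2 < \infty$
and $C, c, \alpha > 0$ such that for all $\gamma$, $n \in \mathbb{N}$, there exists $\Lambda_1 < \lambda_n^\gamma < \Lambda_2$ such that

\[
\left| \mathbb{P}\left( T_n^\delta(\gamma|X)\, \mathbb{P}(X_{i\delta} = \gamma_{i\delta},\ \forall i = 1, \ldots, n+1) > t \right) - e^{-\lambda_n^\gamma t} \right|
\leq C e^{-ct} e^{-\alpha n}.
\tag{4.1}
\]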


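As a consequence of (4.1) we have

Proposition 4. For all $\delta > 0$, there exist $\kappa_1, \kappa_2 > 0$ such that for all $\gamma \in D([0, \infty), A)$,
$\mathbb{P} \otimes \tilde{\mathbb{P}}$ eventually almost surely

\[
-\kappa_1 \log n \leq \log\left( T_n^\delta(\gamma|Y)\, \tilde{\mathbb{P}}(Y_\delta = \gamma_\delta, \ldots, Y_{n\delta} = \gamma_{n\delta}) \right) \leq \log(\log n^{\kappa_2})
\]

and

\[
-\kappa_1 \log n \leq \log\left( T_n^\delta(\gamma|X)\, \mathbb{P}(X_\delta = \gamma_\delta, \ldots, X_{n\delta} = \gamma_{n\delta}) \right) \leq \log(\log n^{\kappa_2}).
\]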







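Therefore, for $\delta > 0$ fixed, we arrive at

\[
\log T_n^\delta(\gamma|X)
= -\sum_{i \in \Sigma_n^\delta(\gamma)} \log\left( \delta\, c(\gamma_{(i-1)\delta})\, p(\gamma_{(i-1)\delta}, \gamma_{i\delta}) \right)
- \sum_{i \in \{1, \ldots, n\} \setminus \Sigma_n^\delta(\gamma)} \log\left( 1 - \delta c(\gamma_{(i-1)\delta}) \right) + o(n).
\tag{4.3}
\]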


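The presence of the $\log(\delta)$ term in the rhs of (4.3) causes the same problem as we
have encountered in Section 2. Therefore, we have to subtract another quantity such
that the $\log(\delta)$ term is canceled. In the spirit of what we did with the waiting times,
we consider the difference with $\log T_n^\delta(\gamma|Y)$, where $\{Y_t : t \geq 0\}$ is another independent Markov process
with generator

\[ \tilde{L}f(x) = \sum_{y \in A} \tilde{c}(x)\, \tilde{p}(x,y)\, (f(y) - f(x)). \]

We then arrive at

\[
\log \frac{T_n^\delta(\gamma|Y)}{T_n^\delta(\gamma|X)}
= \sum_{i \in \Sigma_n^\delta(\gamma)} \log \frac{c(\gamma_{(i-1)\delta})\, p(\gamma_{(i-1)\delta}, \gamma_{i\delta})}{\tilde{c}(\gamma_{(i-1)\delta})\, \tilde{p}(\gamma_{(i-1)\delta}, \gamma_{i\delta})}
+ \sum_{i \in \{1, \ldots, n\} \setminus \Sigma_n^\delta(\gamma)} \log \frac{1 - \delta c(\gamma_{(i-1)\delta})}{1 - \delta \tilde{c}(\gamma_{(i-1)\delta})} + o(n).
\tag{4.4}
\]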


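We then have the following law of large numbers.

Theorem 6. Let $\mathbb{P}$ (resp. $\tilde{\mathbb{P}}$) denote the stationary path space measure of $\{X_t : t \geq 0\}$
(resp. $\{Y_t : t \geq 0\}$) and let $\gamma \in D([0, \infty), A)$ be a fixed trajectory. We then have $\mathbb{P} \otimes \tilde{\mathbb{P}}$-almost
surely:

\[
\lim_{\delta \to 0} \lim_{n \to \infty} \frac{1}{n\delta} \left( \log \frac{T_n^\delta(\gamma|Y)}{T_n^\delta(\gamma|X)}
- \int_0^{n\delta} \log \frac{c(\gamma_s)\, p(\gamma_{s^-}, \gamma_{s^+})}{\tilde{c}(\gamma_s)\, \tilde{p}(\gamma_{s^-}, \gamma_{s^+})}\, dN_s(\gamma)
- \int_0^{n\delta} \left( \tilde{c}(\gamma_s) - c(\gamma_s) \right) ds \right) = 0.
\tag{4.6}
\]

Moreover, if $\gamma$ is chosen according to a stationary ergodic measure $Q$ on path-space,
then $Q$-almost surely

\[
\lim_{\delta \to 0} \lim_{n \to \infty} \frac{1}{n\delta} \left( \log T_n^\delta(\gamma|Y) - \log T_n^\delta(\gamma|X) \right)
= \sum_{x,y \in A} q(x,y) \log \frac{c(x)\, p(x,y)}{\tilde{c}(x)\, \tilde{p}(x,y)}
+ \sum_{x \in A} q(x) \left( \tilde{c}(x) - c(x) \right)
\tag{4.7}
\]

where

\[ q(x,y) = \lim_{t \to \infty} \mathbb{E}_Q \left( \frac{N_t^{xy}(\gamma)}{t} \right), \qquad q(x) = Q(\gamma_0 = x) \]

and where $N_t^{xy}(\gamma)$ denotes the number of jumps from $x$ to $y$ of the trajectory $\gamma$ in the
time-interval $[0, t]$.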

Proof. Using Proposition 4 we use the same proof as that of Theorem 1, and use that
the sums in the rhs of (4.4) are, up to order $\delta^2$, equal to the integrals appearing in the
lhs of (4.6). The other assertions of the theorem follow from the ergodic theorem. □

Remark 3. If we choose $\gamma$ according to the path space measure $\mathbb{P}$, i.e., $\gamma$ is a “typical”
trajectory of the process $\{X_t : t \geq 0\}$, and choose $\tilde{p}(x,y) = p(x,y)$, then we recover
the limit of the law of large numbers for waiting times (Theorem 1):

\[
\lim_{\delta \to 0} \lim_{n \to \infty} \frac{1}{n\delta} \left( \log T_n^\delta(\gamma|Y) - \log T_n^\delta(\gamma|X) \right)
= \sum_{x \in A} \mu(x) c(x) \log \frac{c(x)}{\tilde{c}(x)} + \sum_{x \in A} \mu(x) \left( \tilde{c}(x) - c(x) \right)
= s(\mathbb{P}|\tilde{\mathbb{P}}).
\]